10 research outputs found
Complexity analysis of regularization methods for implicitly constrained least squares
Optimization problems constrained by partial differential equations (PDEs)
naturally arise in scientific computing, as those constraints often model
physical systems or the simulation thereof. In an implicitly constrained
approach, the constraints are incorporated into the objective through a reduced
formulation. To this end, a numerical procedure is typically applied to solve
the constraint system, and efficient numerical routines with quantifiable cost
have long been developed. Meanwhile, the field of complexity in optimization,
which estimates the cost of an optimization algorithm, has received significant
attention in the literature, with most of the focus being on unconstrained or
explicitly constrained problems.
In this paper, we analyze an algorithmic framework based on quadratic
regularization for implicitly constrained nonlinear least squares. By
leveraging adjoint formulations, we can quantify the worst-case cost of our
method to reach an approximate stationary point of the optimization problem.
Our definition of such points exploits the least-squares structure of the
objective, leading to an efficient implementation. Numerical experiments
conducted on PDE-constrained optimization problems demonstrate the efficiency
of the proposed framework.
Comment: 21 pages, 2 figures
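As a rough illustration of the kind of quadratic-regularization iteration analyzed above (not the paper's exact method, which evaluates the reduced objective through adjoint formulations of the implicit constraint), the sketch below applies a regularized Gauss-Newton step to min_x 0.5*||F(x)||^2. The names `F`, `J` and all parameter values are hypothetical:

```python
import numpy as np

def quad_reg_least_squares(F, J, x0, sigma0=1.0, tol=1e-8, max_iter=100):
    """Quadratic-regularization loop for min_x 0.5*||F(x)||^2.
    F(x) returns the residual vector, J(x) its Jacobian (here assumed
    available directly rather than through an adjoint solver)."""
    x, sigma = x0.copy(), sigma0
    for _ in range(max_iter):
        r, Jx = F(x), J(x)
        g = Jx.T @ r                       # gradient of 0.5*||F||^2
        if np.linalg.norm(g) <= tol:       # least-squares stationarity test
            break
        # regularized Gauss-Newton step: (J^T J + sigma I) s = -g
        s = np.linalg.solve(Jx.T @ Jx + sigma * np.eye(x.size), -g)
        r_new = F(x + s)
        if 0.5 * np.dot(r_new, r_new) < 0.5 * np.dot(r, r):
            x, sigma = x + s, max(sigma / 2, 1e-12)   # accept, relax
        else:
            sigma *= 2                                # reject, regularize more
    return x
```

The stopping test on ||J^T F|| mirrors the least-squares stationarity measure mentioned in the abstract.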
Direct search based on probabilistic descent in reduced spaces
Derivative-free algorithms seek the minimum value of a given objective
function without using any derivative information. The performance of these
methods often worsens as the dimension increases, a phenomenon predicted by
their worst-case complexity guarantees. Nevertheless, recent algorithmic
proposals have shown that incorporating randomization into otherwise
deterministic frameworks could alleviate this effect for direct-search methods.
The best guarantees and practical performance are obtained when employing a
random vector and its negative, which amounts to drawing directions in a random
one-dimensional subspace. Unlike for other derivative-free schemes, however,
the properties of these subspaces have not been exploited.
In this paper, we study a generic direct-search algorithm in which the
polling directions are defined using random subspaces. Complexity guarantees
for such an approach are derived thanks to probabilistic properties related to
both the subspaces and the directions used within these subspaces. By
leveraging results on random subspace embeddings and sketching matrices, we
show that better complexity bounds are obtained for randomized instances of our
framework. A numerical investigation confirms the benefit of randomization,
particularly when done in subspaces, when solving problems of moderately large
dimension.
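A minimal sketch of polling along a random unit vector and its negative, i.e. drawing directions in a random one-dimensional subspace. This is a generic direct-search loop with assumed parameter choices, not the paper's framework with subspace embeddings and sketching matrices:

```python
import numpy as np

def ds_probabilistic_descent(f, x0, alpha0=1.0, tol=1e-6, max_iter=4000, seed=0):
    """Direct search whose poll set is {v, -v} for a random unit vector v,
    i.e. two directions spanning a random one-dimensional subspace."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    alpha, fx = alpha0, f(x)
    for _ in range(max_iter):
        if alpha <= tol:                          # step size acts as stopping test
            break
        v = rng.standard_normal(x.size)
        v /= np.linalg.norm(v)
        moved = False
        for d in (v, -v):                         # poll set {v, -v}
            trial = x + alpha * d
            ft = f(trial)
            if ft < fx - 1e-4 * alpha**2:         # sufficient-decrease condition
                x, fx, moved = trial, ft, True
                alpha *= 2.0                      # successful poll: expand
                break
        if not moved:
            alpha *= 0.5                          # unsuccessful poll: contract
    return x, fx
```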
A Subsampling Line-Search Method with Second-Order Results
In many contemporary optimization problems such as those arising in machine
learning, it can be computationally challenging or even infeasible to evaluate
an entire function or its derivatives. This motivates the use of stochastic
algorithms that sample problem data, which can jeopardize the guarantees
obtained through classical globalization techniques in optimization such as a
trust region or a line search. Using subsampled function values is particularly
challenging for the latter strategy, which relies upon multiple evaluations. On
top of that, there has been increasing interest in nonconvex
formulations of data-related problems, such as training deep learning models.
For such instances, one aims at developing methods that converge to
second-order stationary points quickly, i.e., escape saddle points efficiently.
This is particularly delicate to ensure when one only accesses subsampled
approximations of the objective and its derivatives.
In this paper, we describe a stochastic algorithm based on negative curvature
and Newton-type directions that are computed for a subsampling model of the
objective. A line-search technique is used to enforce suitable decrease for
this model, and for a sufficiently large sample, a similar amount of reduction
holds for the true objective. By using probabilistic reasoning, we can then
obtain worst-case complexity guarantees for our framework, leading us to
discuss appropriate notions of stationarity in a subsampling context. Our
analysis encompasses the deterministic regime, and allows us to identify
sampling requirements for second-order line-search paradigms. As we illustrate
through real data experiments, these worst-case estimates need not be satisfied
for our method to be competitive with first-order strategies in practice.
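The subsampling idea can be shown in its simplest form: a backtracking (Armijo) line search applied to a sampled model of a finite-sum objective. The sketch below uses only a subsampled gradient step, whereas the paper's method also computes Newton-type and negative-curvature directions; all names and constants are illustrative assumptions:

```python
import numpy as np

def subsampled_linesearch_step(f_i, g_i, x, n, batch, rng, c=1e-4):
    """One backtracking line-search step on a subsampled model of the
    finite-sum objective f(x) = (1/n) * sum_i f_i(i, x). Decrease is
    enforced on the sampled model only; for a sufficiently large sample
    a comparable decrease holds for the true objective."""
    S = rng.choice(n, size=batch, replace=False)
    fS = lambda y: np.mean([f_i(i, y) for i in S])    # sampled objective
    gS = np.mean([g_i(i, x) for i in S], axis=0)      # sampled gradient
    d = -gS                # steepest-descent direction for the sampled model
    t, f0 = 1.0, fS(x)
    # Armijo condition checked against the sampled model, not the true f
    while fS(x + t * d) > f0 + c * t * np.dot(gS, d):
        t *= 0.5
        if t < 1e-12:
            break
    return x + t * d
```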
A Nonmonotone Matrix-Free Algorithm for Nonlinear Equality-Constrained Least-Squares Problems
Least squares form one of the most prominent classes of optimization problems, with numerous applications in scientific computing and data fitting. When such formulations aim at modeling complex systems, the optimization process must account for nonlinear dynamics by incorporating constraints. In addition, these systems often involve a large number of variables, which increases the difficulty of the problem and motivates the need for efficient algorithms amenable to large-scale implementations.
In this paper, we propose and analyze a Levenberg-Marquardt algorithm for nonlinear least squares subject to nonlinear equality constraints. Our algorithm is based on inexact solves of linear least-squares problems that require only Jacobian-vector products. Global convergence is guaranteed by the combination of a composite step approach and a nonmonotone step acceptance rule. We illustrate the performance of our method on several test cases from data assimilation and inverse problems: our algorithm is able to reach the vicinity of a solution from an arbitrary starting point, and can outperform the most natural alternatives for these classes of problems.
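The matrix-free ingredient can be illustrated in isolation: the sketch below solves the regularized normal equations of a Levenberg-Marquardt subproblem by conjugate gradient, touching the Jacobian only through the products `jvp` and `jtvp`. It omits the composite-step treatment of the equality constraints and the nonmonotone acceptance rule; all names and tolerances are assumptions:

```python
import numpy as np

def lm_step_matrix_free(jvp, jtvp, r, dim, lam, iters=50, tol=1e-10):
    """Matrix-free Levenberg-Marquardt step: solve (J^T J + lam I) s = -J^T r
    by conjugate gradient, using only Jacobian-vector products jvp(v) = J v
    and transposed products jtvp(w) = J^T w; J is never formed."""
    matvec = lambda v: jtvp(jvp(v)) + lam * v   # SPD operator J^T J + lam I
    b = -jtvp(r)
    s = np.zeros(dim)
    res = b - matvec(s)
    p, rs = res.copy(), np.dot(res, res)
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / np.dot(p, Ap)
        s += alpha * p
        res -= alpha * Ap
        rs_new = np.dot(res, res)
        if rs_new < tol:                        # inexact solve: stop early
            break
        p = res + (rs_new / rs) * p
        rs = rs_new
    return s
```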
Detecting negative eigenvalues of exact and approximate Hessian matrices in optimization
Nonconvex minimization algorithms often benefit from the use of second-order
information as represented by the Hessian matrix. When the Hessian at a
critical point possesses negative eigenvalues, the corresponding eigenvectors
can be used to search for further improvement in the objective function value.
Computing such eigenpairs can be computationally challenging, particularly if
the Hessian matrix itself cannot be built directly but must rather be sampled
or approximated. In blackbox optimization, such derivative approximations are
built at a significant cost in terms of function values.
In this paper, we investigate practical approaches to detect negative
eigenvalues in Hessian matrices without access to the full matrix. We propose a
general framework that begins with the diagonal and gradually builds
submatrices to detect negative curvature. Crucially, our approach applies to
the blackbox setting, and can detect negative curvature comparable to the
approximation error. We compare several instances of our framework on a test
set of Hessian matrices from a popular optimization library, and
finite-differences approximations thereof. Our experiments highlight the
importance of the variable order in the problem description, and show that
forming submatrices is often an efficient approach to detect negative
curvature.
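One way to picture the framework is the following sketch: it scans the diagonal first, then grows a principal submatrix one index at a time and stops at the first submatrix with a negative eigenvalue, so the full Hessian is never formed. This is an illustrative instance only (entries are re-queried at each growth step for clarity; a practical version would reuse them), with hypothetical names:

```python
import numpy as np

def find_negative_curvature(entry, n):
    """Detect a negative Hessian eigenvalue via growing principal
    submatrices; entry(i, j) samples H[i, j] on demand.
    Returns (negative eigenvalue estimate, curvature direction) or None."""
    # Diagonal pass: a negative diagonal entry is immediate negative curvature.
    diag = np.array([entry(i, i) for i in range(n)])
    i = int(np.argmin(diag))
    if diag[i] < 0:
        e = np.zeros(n)
        e[i] = 1.0
        return diag[i], e
    # Grow a principal submatrix one index at a time, starting from the
    # smallest diagonal entry, and test its smallest eigenvalue.
    idx = [i]
    for j in range(n):
        if j in idx:
            continue
        idx.append(j)
        H = np.array([[entry(a, b) for b in idx] for a in idx])
        vals, vecs = np.linalg.eigh(H)
        if vals[0] < 0:
            v = np.zeros(n)
            v[idx] = vecs[:, 0]         # pad the subspace eigenvector with zeros
            return vals[0], v           # negative curvature direction found
    return None                         # no negative eigenvalue detected
```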
A Stochastic Levenberg--Marquardt Method Using Random Models with Complexity Results
Globally convergent variants of the Gauss-Newton algorithm are often the
methods of choice to tackle nonlinear least-squares problems. Among such
frameworks, Levenberg-Marquardt and trust-region methods are two
well-established, similar paradigms. Both schemes have been studied when the
Gauss-Newton model is replaced by a random model that is only accurate with a
given probability. Trust-region schemes have also been applied to problems
where the objective value is subject to noise: this setting is of particular
interest in fields such as data assimilation, where efficient methods that can
adapt to noise are needed to account for the intrinsic uncertainty in the input
data.
In this paper, we describe a stochastic Levenberg-Marquardt algorithm that
handles noisy objective function values and random models, provided sufficient
accuracy is achieved in probability. Our method relies on a specific scaling of
the regularization parameter, which allows us to leverage existing results for
trust-region algorithms. Moreover, we exploit the structure of our objective
through the use of a family of stationarity criteria tailored to least-squares
problems. Provided the probability of accurate function estimates and models is
sufficiently large, we bound the expected number of iterations needed to reach
an approximate stationary point, which generalizes results based on using
deterministic models or noiseless function values. We illustrate the link
between our approach and several applications related to inverse problems and
machine learning.
Comment: To appear in SIAM/ASA J. Uncertain. Quantif., 202
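A schematic of such a loop, with noisy residual and Jacobian estimates supplied by a randomized oracle: the sketch below uses a plain accept/reject update of the regularization parameter, not the specific trust-region-compatible scaling analyzed in the paper, and all names and constants are assumptions:

```python
import numpy as np

def stochastic_lm(sample_rJ, x0, gamma0=1.0, max_iter=200, eta=0.1, seed=0):
    """Stochastic Levenberg-Marquardt sketch for min_x 0.5*||r(x)||^2.
    sample_rJ(x, rng) returns noisy estimates (r, J) of the residual and
    Jacobian; decrease is measured with these estimates only."""
    rng = np.random.default_rng(seed)
    x, gamma = np.asarray(x0, dtype=float), gamma0
    for _ in range(max_iter):
        r, J = sample_rJ(x, rng)
        g = J.T @ r
        s = np.linalg.solve(J.T @ J + gamma * np.eye(x.size), -g)
        # predicted vs (estimated) actual decrease of 0.5*||r||^2
        pred = -(g @ s + 0.5 * s @ (J.T @ J) @ s)
        r_new, _ = sample_rJ(x + s, rng)
        ared = 0.5 * (r @ r - r_new @ r_new)
        if pred > 0 and ared >= eta * pred:
            x, gamma = x + s, max(gamma / 2, 1e-8)   # accept: relax
        else:
            gamma *= 4                               # reject: regularize more
    return x
```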
A Komatiite Succession as an Analog for the Olivine Bearing Rocks at Jezero
The Mars 2020 rover landed at Jezero crater on February 18, 2021. Since then, the rover has traveled around the Séítah region and has collected data from the Mastcam-Z, SuperCam, PIXL and SHERLOC instruments that have led to insights into the formation of the olivine-clay-carbonate bearing rocks that were identified from orbit. Here we discuss three questions: 1) What have we learned about the olivine-clay-carbonate unit? 2) What terrestrial analogs exist for the unit? 3) Why do the rocks have a thinly layered morphology? We shall briefly mention instrumental measurements which provide important information regarding the olivine bearing rock at Séítah.